Multi-Attribute Factorized Hidden Layer Adaptation for DNN Acoustic Models

نویسندگان

  • Lahiru Samarakoon
  • Khe Chai Sim
چکیده

Recently, the Factorized Hidden Layer (FHL) adaptation is proposed for speaker adaptation of deep neural network (DNN) based acoustic models. In addition to the standard affine transformation, an FHL contains a speaker-dependent (SD) transformation matrix using a linear combination of rank-1 matrices and an SD bias using a linear combination of vectors. In this work, we extend the FHL based adaptation to multiple variabilities of the speech signal. Experimental results on Aurora4 task show 26.0% relative improvement over the baseline when standard FHL adaptation is used for speaker adaptation. The Multiattribute FHL adaptation shows gains over the standard FHL adaptation where improvements reach up to 29.0% relative to the baseline.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models

Using auxiliary input features has been seen as one of the most effective ways to adapt deep neural network (DNN)-based acoustic models to speaker or environment. However, this approach has several limitations. It only performs compensation of the bias term of the hidden layer and therefore does not fully exploit the network capabilities. Moreover, it may not be well suited for certain types of...

متن کامل

Learning Factorized Transforms for Unsupervised Adaptation of LSTM-RNN Acoustic Models

Factorized Hidden Layer (FHL) adaptation has been proposed for speaker adaptation of deep neural network (DNN) based acoustic models. In FHL adaptation, a speaker-dependent (SD) transformation matrix and an SD bias are included in addition to the standard affine transformation. The SD transformation is a linear combination of rank-1 matrices whereas the SD bias is a linear combination of vector...

متن کامل

Rapid adaptation for deep neural networks through multi-task learning

We propose a novel approach to addressing the adaptation effectiveness issue in parameter adaptation for deep neural network (DNN) based acoustic models for automatic speech recognition by adding one or more small auxiliary output layers modeling broad acoustic units, such as mono-phones or tied-state (often called senone) clusters. In scenarios with a limited amount of available adaptation dat...

متن کامل

A study of speaker adaptation for DNN-based speech synthesis

A major advantage of statistical parametric speech synthesis (SPSS) over unit-selection speech synthesis is its adaptability and controllability in changing speaker characteristics and speaking style. Recently, several studies using deep neural networks (DNNs) as acoustic models for SPSS have shown promising results. However, the adaptability of DNNs in SPSS has not been systematically studied....

متن کامل

Robust i-vector based adaptation of DNN acoustic model for speech recognition

In the past, conventional i-vectors based on a Universal Background Model (UBM) have been successfully used as input features to adapt a Deep Neural Network (DNN) Acoustic Model (AM) for Automatic Speech Recognition (ASR). In contrast, this paper introduces Hidden Markov Model (HMM) based ivectors that use HMM state alignment information from an ASR system for estimating i-vectors. Further, we ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016